Generalized K-fan Multimodal Deep Model with Shared Representations

Authors

Gang Chen

Sargur N. Srihari

Abstract

Multimodal learning with deep Boltzmann machines (DBMs) is a generative approach to fusing multimodal inputs, and it can learn a shared representation via Contrastive Divergence (CD) for classification and information retrieval tasks. However, it is a 2-fan DBM model and cannot effectively handle multiple prediction tasks. Moreover, this model cannot recover the hidden representations well by sampling from the conditional distribution when more than one modality is missing. In this paper, we propose a K-fan deep structure model that can handle multi-input and multi-output learning problems effectively. In particular, the deep structure has K branches for different inputs, where each branch can be composed of a multi-layer deep model, and a shared representation is learned in a discriminative manner to tackle multimodal tasks. Given the deep structure, we propose two objective functions to handle two multi-input and multi-output tasks: joint visual restoration and labeling, and multi-view multi-class object recognition. To estimate the model parameters, we initialize the deep model parameters with CD to maximize the joint distribution, and then we use backpropagation to update the model according to the specific objective function. The experimental results demonstrate that the model can effectively leverage multi-source information and performs well on multiple prediction tasks compared with competitive baselines.
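The K-branch structure described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the function names (`build_k_fan`, `forward`), the layer sizes, the sigmoid activations, and the choice to fuse branches by concatenation are all assumptions made for illustration, and the sketch omits both the CD initialization and the backpropagation step.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Affine map followed by a sigmoid nonlinearity."""
    weights, biases = layer
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def build_k_fan(input_sizes, branch_hidden, shared_size, output_sizes, seed=0):
    """One stack of layers per input, one shared layer, one head per task."""
    rng = random.Random(seed)
    branches = []
    for n_in in input_sizes:
        sizes = [n_in] + list(branch_hidden)
        branches.append([init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])])
    # Assumption: the shared layer sees the concatenated top layers of all branches.
    fused_size = len(input_sizes) * branch_hidden[-1]
    shared = init_layer(fused_size, shared_size, rng)
    heads = [init_layer(shared_size, n_out, rng) for n_out in output_sizes]
    return {"branches": branches, "shared": shared, "heads": heads}


def forward(model, inputs):
    """Run each input through its branch, fuse, then evaluate every head."""
    fused = []
    for layers, x in zip(model["branches"], inputs):
        h = x
        for layer in layers:
            h = dense(layer, h)
        fused.extend(h)  # concatenate the top activations of all K branches
    shared = dense(model["shared"], fused)
    outputs = [dense(head, shared) for head in model["heads"]]
    return shared, outputs


# Usage: K = 3 inputs of different sizes and two output tasks (hypothetical sizes).
model = build_k_fan(input_sizes=[8, 5, 3], branch_hidden=[6, 4],
                    shared_size=4, output_sizes=[8, 2])
inputs = [[0.5] * 8, [0.1] * 5, [0.9] * 3]
shared, outputs = forward(model, inputs)
print(len(shared), [len(o) for o in outputs])  # prints: 4 [8, 2]
```

In this sketch the first head could stand in for a restoration-style output and the second for a label output, which mirrors the multi-input, multi-output setting only loosely; the actual objective functions are defined in the paper, not here.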

Similar resources

Multimodal Emotion Recognition Using Multimodal Deep Learning

To enhance the performance of affective models and reduce the cost of acquiring physiological signals for real-world applications, we adopt a multimodal deep learning approach to construct affective models from multiple physiological signals. For the unimodal enhancement task, we indicate that the best recognition accuracy of 82.11% on the SEED dataset is achieved with shared representations generated by...

Multimodal sparse representation learning and applications

Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between modalities. The framework can model relationships at a higher level by forcing the shared sparse representation. In particular, we propose the use of joint dictionary l...

Improved Multimodal Deep Learning with Variation of Information

Deep learning has been successfully applied to multimodal representation learning problems, with a common strategy to learning joint representations that are shared across multiple modalities on top of layers of modality-specific networks. Nonetheless, there still remains a question how to learn a good association between data modalities; in particular, a good generative model of multimodal dat...

Beyond Bilinear: Generalized Multi-modal Factorized High-order Pooling for Visual Question Answering

Visual question answering (VQA) is challenging because it requires a simultaneous understanding of both visual content of images and textual content of questions. To support the VQA task, we need to find good solutions for the following three issues: 1) fine-grained feature representations for both the image and the question; 2) multi-modal feature fusion that is able to capture the complex int...

Effect of Blade Design Parameters on Air Flow through an Axial Fan

The objective of this paper is the numerical study of the flow through an axial fan and examining the effects of blade design parameters on the performance of the fan. The axial fan is extensively used for cooling of the electronic devices and servers. Simulation of the three-dimensional incompressible turbulent flow was conducted by numerical solution of the (RANS) equations for a model. The S...

Journal: CoRR

Volume: abs/1503.07906

Publication date: 2015
